Add image query support to the backend microservices#12
dmsuehir merged 39 commits into mmqna-image-query
Signed-off-by: dmsuehir <[email protected]>
Signed-off-by: okhleif-IL <[email protected]>
* added in audio dict creation
* separated audio from prompt
* added ASR endpoint
* removed ASR endpoints from mm embedding
* edited return logic, fixed function call
* added megaservice to elif
* reworked helper func
* Append audio to prompt
* Reworked handle messages, added metadata
* Moved dictionary logic to right place
* changed logic to rely on message len
* list --> empty str
---------
Signed-off-by: Melanie Buehler <[email protected]>
Signed-off-by: okhleif-IL <[email protected]>
Signed-off-by: dmsuehir <[email protected]>
…ina/image_query Signed-off-by: dmsuehir <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: okhleif-IL <[email protected]>
Fixed role bug where enumeration was wrong
…nto dina/image_query Signed-off-by: dmsuehir <[email protected]>
Signed-off-by: Melanie Buehler <[email protected]>
| else: | ||
| print("request is from user.") | ||
| text = req_dict["prompt"] | ||
| text = f"<image>\nUSER: {text}\nASSISTANT:" |
Previously, the LVM microservice always added <image>\n and USER: to the beginning of the prompt created by the gateway. I had to change that because images can now be scattered throughout the conversation, so we can no longer just attach a single image to the first message. I updated this mock of the LVM microservice to do something similar to what the actual LVM service does: it checks how many image tags already exist in the prompt and adds extras if they are needed. Extras may be needed when the retriever also gets an image from the vector store.
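For reference, the tag-counting idea can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the helper name `ensure_image_tags` is hypothetical, and prepending the missing tags to the front of the prompt is an assumption about placement.

```python
def ensure_image_tags(prompt: str, num_images: int, tag: str = "<image>\n") -> str:
    """Return a prompt that has at least one image tag per image.

    Hypothetical helper: counts the "<image>" tags already in the prompt and
    prepends any that are missing. Where the real LVM service places the
    extra tags is an assumption here.
    """
    existing = prompt.count("<image>")
    missing = max(num_images - existing, 0)
    return tag * missing + prompt


# One tag already present, one image: the prompt is returned unchanged.
unchanged = ensure_image_tags("USER: <image>\nhi\nASSISTANT:", 1)

# One tag present, two images (for example, one more from the retriever):
# a single extra tag is added.
padded = ensure_image_tags("USER: <image>\nhi\nASSISTANT:", 2)
```

With this approach the gateway can place tags wherever the images occur in the conversation, and the service only fills in the difference.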
| ) | ||
| # print(result_dict) | ||
| self.assertEqual(result_dict[self.lvm.name]["text"], "<image>\nUSER: chao, \nASSISTANT:") | ||
| self.assertEqual(result_dict[self.lvm.name]["text"], "USER: <image>\nchao, \nASSISTANT:") |
Previously, the prompts had <image>\nUSER:, but the prompt format documented by Hugging Face is the other way around: USER: <image>\n. Since images will now also be interleaved in the conversation, I changed the prompts to match the HF docs.
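To illustrate the interleaved format, here is a small sketch of how a prompt with the tag after the role could be assembled. The function `build_prompt` and the `(role, text, num_images)` tuple shape are hypothetical and exist only for this example; the gateway's actual message handling is more involved.

```python
from typing import List, Tuple


def build_prompt(turns: List[Tuple[str, str, int]]) -> str:
    """Build a LLaVA 1.5 style prompt from (role, text, num_images) turns.

    Hypothetical sketch: each image tag is placed right after the role of
    the turn the image belongs to, so images can appear anywhere in the
    conversation.
    """
    parts = []
    for role, text, num_images in turns:
        tags = "<image>\n" * num_images
        parts.append(f"{role}: {tags}{text}")
    # A trailing ASSISTANT: marker asks the model for the next reply.
    parts.append("ASSISTANT:")
    return "\n".join(parts)


single = build_prompt([("USER", "chao, ", 1)])
multi = build_prompt(
    [
        ("USER", "What is this?", 1),
        ("ASSISTANT", "A cat.", 0),
        ("USER", "And this?", 1),
    ]
)
```

For the single-turn case this produces the same string that the updated assertion expects.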
| formats than the default LLaVA 1.5 model. | ||
| """ | ||
| # Models to test and their expected prompts |
Normally I would have parameterized this test with the pytest decorator, but that doesn't seem to work here because the test classes subclass unittest.IsolatedAsyncioTestCase. There is a third-party library called parameterized that could do it, but the unittest container that this runs in doesn't have that dependency, and it's easier to avoid adding one (stackoverflow reference for the issue).
Instead, I keep lists of the models and their expected prompt formats, and loop over them to do the checks.
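A rough sketch of that looping pattern, using only the standard library. The model names, templates, and the `format_prompt` stand-in are hypothetical; `subTest` is optional here and is shown only because it reports each model separately when a check fails.

```python
import unittest

# Hypothetical templates standing in for the code under test.
TEMPLATES = {
    "model-a": "USER: <image>\n{text}\nASSISTANT:",
    "model-b": "<image>\n{text}",
}

# Models to test and their expected prompts (illustrative values only).
CASES = [
    ("model-a", "USER: <image>\nhello\nASSISTANT:"),
    ("model-b", "<image>\nhello"),
]


async def format_prompt(model: str, text: str) -> str:
    """Stand-in for the async call that the real test would await."""
    return TEMPLATES[model].format(text=text)


class TestPromptFormats(unittest.IsolatedAsyncioTestCase):
    async def test_prompt_per_model(self):
        # Loop over the cases instead of using a parameterize decorator.
        for model, expected in CASES:
            with self.subTest(model=model):
                result = await format_prompt(model, "hello")
                self.assertEqual(result, expected)
```

This keeps the test free of third-party dependencies at the cost of a single test method covering all models.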
Adds unit test coverage for audio query
Fix port number placement
okhleif-10 left a comment
LGTM, do all tests pass?
| # Multimodal RAG QnA With Videos has not yet accepts image as input during QnA. | ||
| num_messages = len(data["messages"]) if isinstance(data["messages"], list) else 1 | ||
| # Multimodal RAG QnA With Videos has not yet accepts image as input during QnA. |
Can you remove this comment or update it for accuracy?
| self.assertEqual(len(b64_types["image"]), 2) | ||
| finally: | ||
| test_gateway.stop() | ||
I rebased and synced this branch with
| if echo "$CONTENT" | grep -q "retrieved_docs"; then | ||
| echo "[ retriever ] Content has retrieved_docs as expected." | ||
| if echo "$CONTENT" | grep -q "retrieved_docs"; then |
Is this grep supposed to look for img_b64_str?
Yes, good point. I will fix this.
| logflag = os.getenv("LOGFLAG", False) | ||
| # The maximum number of images that should be sent to the LVM | ||
| max_images = int(os.getenv("MAX_IMAGES", 1)) |
In this line, is 1 being set manually, or is it a default?
It's a default: if MAX_IMAGES is unset, max_images will be 1.
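A quick way to see the fallback behavior, as a small sketch. The helper name `get_max_images` is hypothetical, and it reads from a mapping passed in (defaulting to `os.environ`) so the behavior can be checked without changing the real environment; `os.getenv(key, default)` behaves the same way as `os.environ.get(key, default)`.

```python
import os
from typing import Mapping


def get_max_images(environ: Mapping[str, str] = os.environ) -> int:
    """Return MAX_IMAGES as an int, falling back to 1 when it is unset."""
    # The second argument is only used when the key is missing.
    return int(environ.get("MAX_IMAGES", 1))


default_value = get_max_images({})                      # variable unset
configured_value = get_max_images({"MAX_IMAGES": "3"})  # variable set
```

Note that a variable that is set but empty would still raise a ValueError in `int()`, since the default only applies when the variable is missing.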
Description
Updates the following microservices:
Issues
https://github.com/opea-project/docs/blob/main/community/rfcs/24-10-02-GenAIExamples-001-Image_and_Audio_Support_in_MultimodalQnA.md
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
No new dependencies
Tests
Tests have been updated